2,034 research outputs found
Multilingual Unsupervised Sentence Simplification
Progress in Sentence Simplification has been hindered by the lack of
supervised data, particularly in languages other than English. Previous work
has aligned sentences from original and simplified corpora such as English
Wikipedia and Simple English Wikipedia, but this limits corpus size, domain,
and language. In this work, we propose using unsupervised mining techniques to
automatically create training corpora for simplification in multiple languages
from raw Common Crawl web data. When coupled with a controllable generation
mechanism that can flexibly adjust attributes such as length and lexical
complexity, these mined paraphrase corpora can be used to train simplification
systems in any language. We further incorporate multilingual unsupervised
pretraining methods to create even stronger models and show that by training on
mined data rather than supervised corpora, we outperform the previous best
results. We evaluate our approach on English, French, and Spanish
simplification benchmarks and reach state-of-the-art performance with a totally
unsupervised approach. We will release our models and code to mine the data in
any language included in Common Crawl.
Teaching digital and global law for digital and global students: creating students as producers in a Hong Kong Internet Law class
In an increasingly globalised and digitalised society and economy, legal education needs to foster a different skill set among students from that taught traditionally. Law students need practice in responding to a variety of scenarios and contexts, as well as in developing creative and critical thinking skills. The "student as producer" approach provides opportunities to build such skills by having students produce work that could benefit their classmates and future cohorts, and contribute to the discipline's knowledge base. We present a case study of a final-year undergraduate law course, Internet and the Law, at the Chinese University of Hong Kong, where we used the "student as producer" approach, collaborated with external organisations, and used digital tools to foster global and digital-savvy law students. Using a mixed-methods approach, we highlight successes and limitations of using the "student as producer" approach, digital tools, and an internationalised curriculum in our law classroom. Overall, students and staff found the approach successful in providing global and digital law students with practical skills. We also identified limitations and challenges to be addressed in future projects. Our findings speak to broader themes of active engagement, contributions, and practical knowledge for law students in their learning and future careers.
Learning to Speak and Act in a Fantasy Text Adventure Game
We introduce a large scale crowdsourced text adventure game as a research
platform for studying grounded dialogue. In it, agents can perceive, emote, and
act whilst conducting dialogue with other agents. Models and humans can both
act as characters within the game. We describe the results of training
state-of-the-art generative and retrieval models in this setting. We show that
in addition to using past dialogue, these models are able to effectively use
the state of the underlying world to condition their predictions. In
particular, we show that grounding on the details of the local environment,
including location descriptions, and the objects (and their affordances) and
characters (and their previous actions) present within it allows better
predictions of agent behavior and dialogue. We analyze the ingredients
necessary for successful grounding in this setting, and how each of these
factors relates to agents that can talk and act successfully.